Searching for Related Descriptors Among Different Datasets: A New Strategy Implemented by the R Package “Dadi”
Abstract
Background: The growing number of techniques used to describe organisms and taxa produces multivariate datasets, often composed of relatively independent descriptors. Handling many descriptors can be laborious, and is often unnecessary when their information is not congruent with that of the other datasets used in the same study. On the other hand, different levels of correlation between single descriptors and a whole dataset may provide useful scientific hints. The DADI (Distance-based Analysis for (optimal) Descriptor Identification) algorithm is proposed to allow a rapid and complete analysis of descriptors coming from two different datasets with the same number of objects. DADI was employed to select FTIR (Fourier Transform Infrared Spectroscopy) spectral wavelengths according to their correlation with the 26S rDNA sequences of strains belonging to a yeast genus. Results: This procedure defined a set of optimal wavelengths, with an overall increase in the correlation between FTIR and 26S data. Conclusions: DADI can identify the FTIR wavenumbers that best fit the chosen reference, thereby defining the descriptors to be used in FTIR and possibly in other metabolomic analyses.
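The distance-based idea in the abstract can be illustrated with a minimal Python sketch. This is not the DADI package's implementation, which is written in R and whose internals are not described here. It assumes, for illustration only, that each descriptor is turned into a vector of pairwise absolute differences between objects and scored by its Pearson correlation with a reference distance vector. The function names (`pairwise_distances`, `pearson`, `rank_descriptors`) and the toy data are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pairwise_distances(values):
    """Return |a - b| for every unordered pair of objects (condensed form)."""
    return [abs(a - b) for a, b in combinations(values, 2)]


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if one is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def rank_descriptors(descriptors, reference_distances):
    """Score each descriptor by how well its distances track the reference."""
    scores = {
        name: pearson(pairwise_distances(values), reference_distances)
        for name, values in descriptors.items()
    }
    # Best-correlated descriptors first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy data (assumed): four objects, two descriptors, one reference variable.
reference_distances = pairwise_distances([0.0, 1.0, 2.0, 3.0])
descriptors = {
    "w1": [0.0, 2.0, 4.0, 6.0],  # distances proportional to the reference
    "w2": [1.0, 5.0, 1.0, 5.0],  # distances unrelated to the reference ordering
}
ranking = rank_descriptors(descriptors, reference_distances)
for name, score in ranking:
    print(name, round(score, 3))
```

In this toy case `w1` ranks first because its pairwise distances are proportional to the reference distances. A real analysis would use the distance measures appropriate to each dataset (for example, a sequence distance for the 26S data).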
Similar resources
A New Development of Electronic Descriptors for Simulation of 13C Chemical Shifts
The 13C chemical shifts for a series of compounds, including carboxylic acids, aldehydes, ethers, ketones and hydrocarbons, were simulated using parametric techniques. The observed chemical shifts were related to numerically encoded structural parameters called descriptors. Two new electronic descriptors were added to the previous descriptors. Plotting of experimental ver...
netgwas: An R Package for Network-Based Genome-Wide Association Studies
Graphical models provide powerful tools for modeling, and making statistical inferences about, complex relationships among variables in multivariate data. They are widely used in statistics and machine learning, particularly to analyze biological networks. In this paper, we introduce the R package netgwas, which is designed to accomplish three important and inter-related goals in genetics: l...
Elastic constants and their variation by pressure in the cubic PbTiO3 compound using IRelast computational package within the density functional theory
In this paper, we study the structural and electronic properties of the cubic PbTiO3 compound by using the density functional the...
Predicting stock prices on the Tehran Stock Exchange by a new hybridization of Fuzzy Inference System and Fuzzy Imperialist Competitive Algorithm
Investing in the stock exchange, as one of the financial resources, has always been a favorite among many investors. Today, finance, and stock exchanges in particular, is one of the areas where prediction is especially important. The main objective in these markets is to predict future price trends in order to adopt a suitable strategy for buying or selling. In general, an inv...
PharmacoGx: an R package for analysis of large pharmacogenomic datasets
Pharmacogenomics holds great promise for the development of biomarkers of drug response and the design of new therapeutic options, which are key challenges in precision medicine. However, such data are scattered and lack standards for efficient access and analysis, consequently preventing the realization of the full potential of pharmacogenomics. To address these issues, we implement...